Using permeability to estimate linear barrier permeability

anonymous for manuscript review

2025-07-09

Introduction: the ‘permeability’ package

The permeability package uses maximum likelihood estimation with animal biotelemetry data to statistically estimate the permeability of a linear barrier of interest.

Package functions take an input set of movement tracks with similar sampling rates and quantify crossings of a given linear barrier (Figure 1).


Figure 1. Diagram illustrating analysis steps using the permeability package with a set of movement data.
Figure 1. Diagram illustrating analysis steps using the permeability package with a set of movement data.


The central function, fitPermeability, uses maximum likelihood estimation to determine crossing probabilities and to generate an overall permeability score for the barrier of interest, together with confidence intervals. Values close to 0 represent a road that is impermeable and values close to 1 represent a highly permeable road to animal movement. Values greater than 1 are also possible, with a “hyper-permeable” barrier.

Additional functions can be used to a) estimate permeability of the barrier as a function of covariate(s), b) compare models, and c) summarize and visualize model results.

This vignette demonstrates use of package functions with both simulated and “real” animal movement data.

To load the package:

library(devtools)
install_github("pathprovidedonpublication", vignettes = TRUE)
library(permeability)

The ‘permeability’ method

Quantifying crossing probabilities

Given a set of consecutive animal locations (\(Z\)), “permeability” (\(\kappa\)) is modeled as the influence of a linear element (e.g., a road or barrier, \(B\)) on the animal’s redistribution kernel, aka the probability distribution of finding an animal one-time unit (or “step”) after a given observation. A barrier with a permeability of 0 results in a zero probability density on the other side, whereas a semi-permeable barrier results in a proportion of the distribution, scaled relative to the null distribution, occurring on the other side of the barrier. The amount that the redistribution kernel on the other side of the barrier is scaled relative to the null distribution can be captured with the parameter, \(\kappa\), with values ranging from 0 to 1, or even > 1 in the case of a “hyper-permeable” barrier.

Figure 2: Example of a possible redistribution kernel for two consecutive locations, Z, with \kappa values 0-1.
Figure 2: Example of a possible redistribution kernel for two consecutive locations, \(Z\), with \(\kappa\) values 0-1.


The probability at time t that an animal crosses a given barrier is given by:

\[c_t(Z_t, B, \kappa) = 1 - s_0(Z_t, B)^\kappa\]

where \(s_0(\cdot) = 1 - c_0(\cdot)\) is the null probability that the animal “stays” (does not cross barrier \(B\) from location \(Z_t\)). This formulation has the desired properties: when \(\kappa = 0\), the probability of crossing is 0, when \(\kappa = 1\), the probability of crossing is the null probability \(c_0 = 1-s_0\). As \(\kappa\) approaches infinity, the probability of crossing approaches 1.

Thus, given a set of \(k\) null steps \(\vec{S_i}\) where \(i \in \{1,2,...,k\}\) we take each location of interest \(Z_t\), add the steps \(S_i\), and simply count the number of relocations on either side of the barrier \(B\).

Once this null set is obtained, the number of null locations that crossed and didn’t cross the boundary can be counted (\(n_{c,t}\) and \(n_{s,t}\), respectively) and the estimate of the null probability of staying \(s_0\) found as:

\[\widehat{s_{0,t}} = \frac{n_{s,t}}{n_{c,t} + n_{s,t}}\]

The movement and barrier data can thus be reduced to a fundamental set of interacting observations, stored within a data frame that we define as the “crossing table”, where each row represents a single potential crossing event. At a minimum, the table contains a vector of actual crossing events \(c_i = \{0,1\}\), a column of estimated null crossing probabilities \(\widehat{c_{0,i}}\), and a column of estimated null “staying” probabilities \(\widehat{s_{0,i}} = 1- c_{0,1}\). The crossing table can further be annotated with any number of covariates, as detailed below.

Likelihood function

After obtaining the crossing table, the log-likelihood function of \(\kappa\) given movement locations \(Z\) and barrier \(B\), is:

\[{\cal L}(\kappa | Z,B) = \prod_{i=1}^n \left(I(c_i) (1 - \widehat{s_{0,i}}^\kappa) + (1 - I(c_i)) \widehat{s_{0,i}}^\kappa \right)\]

where \(I(c_i)\) is the indicator function of having crossed at time i, i.e. the first term represents the rows of the crossing table (movement steps) where the animal actually crossed the barrier and the second term represents the movement steps where the animal did not cross the barrier.

From this, the (log)-likelihood is readily maximized to estimate \(\kappa\) with approximate confidence intervals (via the Hessian of the likelihood at the maximum likelihood estimator).

Incorporating covariates

The permeability parameter \(\kappa\) can also be estimated as an exponential function of covariate(s) \(X\):

\[\kappa = \exp(\beta X)\]

Covariates may represent a feature of either the barrier (e.g., a static variable, such as variable structure with different barrier segments, or a dynamic variable, such as vehicle traffic or kilometer along the highway) or the animal movement (e.g., a static variable, such as animal sex, or a dynamic variable, such as season). Multiple covariates can be included in a multivariate model formula - however, users should be cautious of parsimony, over-fitting, and sample size. When fitting categorical data, users should ensure that there is a large enough sample size of steps proximate to the barrier to estimate \(\kappa\) for different category levels.

Null step creation

The choices made in creating null steps are important, as the distribution of step lengths sampled to create the null set of steps can impact the null probability of crossing the barrier and therefore, the estimation of \(\kappa\).

Null steps should accurately reflect crossing behavior, such that each null step has the “potential to cross”. This may be best accomplished by simply sampling from an empirical distribution of step lengths for steps that cross and from a distribution of turning angles observed for any steps proximate to the barrier.

Figure 3: Illustration of two different ways to generate null steps from observed data, a) sampling all observed steps proximate to the barrier and b) sampling only steps that cross the barrier, with each resulting in distinct null probabilities of crossing.
Figure 3: Illustration of two different ways to generate null steps from observed data, a) sampling all observed steps proximate to the barrier and b) sampling only steps that cross the barrier, with each resulting in distinct null probabilities of crossing.

This approach accounts for crossing-specific differences in step lengths, commonly observed for animals interacting with anthropogenic features, e.g., animals “bolting” across a road with larger step lengths to reduce risk of harm. Notably, for datasets with a low number of crossings (\(< 30\)), there may not be enough steps that cross to create an empirical distribution of step lengths to sample from and a simulated distribution may be required (similar to methods used for integrated Step Selection Analyses).

Analysis Steps and Examples

Use of the permeability package assumes that:

  1. the user has familiarity with coding and working with spatial data in R (specifically, with the sf package)

and

  1. has processed and cleaned their data appropriately, including a dataset of movement tracks for their taxa of interest (with some regular fix rate) and a linear barrier (as either an sf LINESTRING geometry type or a matrix of barrier segment locations) that the movement data interacts with.

We demonstrate usage of the package and its associated functions with both simulated and “real” movement data below.

Simulation examples

Null model

Using the simulatePermeability function, a hypothetical movement track (as a correlated random walk, with a length \(N\), turning angles \(\theta\), and step lengths \(L\)) can be simulated with an example barrier (composed of a set of linear segments), with the barrier having a pre-defined \(\kappa\).

The user can mainly define:

  • Shape and scale parameters (step.shape, step.scale) for simulated track step lengths (Weibull distribution, see rweibull function)

  • Rho or concentration parameter (theta.rho) for track turning angles (wrapped Cauchy distribution, see rwrappedcauchy function from the circular package)

  • Track length (N.steps)

  • Bounding box x and y limits for the simulation (xmax, ymax)

  • Number of segments in the barrier (n.segments)

  • Desired permeability of the barrier (kappa)

If the plot.track argument is set to TRUE, a plot of the simulated track and barrier will be returned, with steps that crossed the barrier shown in purple and steps that “bounced” off the barrier shown in orange:

sim_track <- simulatePermeability(xmax = 30, ymax = 30, 
                                  N.steps = 500, theta.rho = 0.5, 
                                  step.shape = 5, step.scale = 10, 
                                  n.segments = 10,
                                  kappa = 0.2,
                                  plot.track = TRUE) 


The output of this simulation is a permdata list object, which contains two elements:

  • track - data frame containing:

  • Start and end locations of each movement step (Z.start,Z.end)

  • Time difference between consecutive locations (D.time)

  • Simulated timestamps for each location (Time)

  • Movement step as a complex location (Step, see as.complex())

  • Step length a.k.a the Euclidean distance between consecutive locations (L)

  • Absolute and turning angles for locations (Phi and Theta)

  • A unique identifier for each step (Step.ID)

  • Whether the step was within the maximum step length of the barrier (In.buffer) - only steps proximate to the barrier (where null crossings are possible) are needed to estimate \(\kappa\) - for simulated data from simulatePermeability, all steps are retained (all simulated steps have In.buffer==TRUE)

  • Distance to barrier for the start location in each step (Dist_toBarrier)

  • Whether the step crossed the barrier or not (Crossed) - NA values for steps where In.buffer==FALSE

  • Simulated unique track identifier (ID)

And:

  • barrier - data frame containing:

  • Complex locations (see as.complex()) for the start and end of each barrier segment (Z1, Z2)

  • Identifier for each line (in sf terms, LINESTRING) in the barrier (line_id) - important for barriers with multiple lines (e.g., a grid-like or network-like barrier) - will be set to 1 if there is only 1 unique line

  • Identifier for each barrier segment (barrier.id)

  • “True” permeability \(\kappa\) values (kappa) for each barrier segment, based on your input parameters

  • If a covariate model is being used, optionally covar values, one for each unique barrier segment (see “Covariate Model” example below)

class(sim_track)
> [1] "permdata"
str(sim_track)
> List of 2
>  $ track  :'data.frame':  499 obs. of  13 variables:
>   ..$ Z.start       : cplx [1:499] 28.8-8.7i 21.5-11.4i 27.5-22.2i ...
>   ..$ Z.end         : cplx [1:499] 21.5-11.4i 27.5-22.2i 29.8-13.2i ...
>   ..$ D.time        : num [1:499] 1 1 1 1 1 1 1 1 1 1 ...
>   ..$ Time          : POSIXct[1:499], format: "2025-02-25 00:00:00" "2025-02-25 01:00:00" ...
>   ..$ Step          : cplx [1:499] -7.3-2.69i 6.02-10.8i 2.33+8.99i ...
>   ..$ L             : num [1:499] 7.78 12.37 9.29 8.67 10.01 ...
>   ..$ Phi           : num [1:499] -2.79 -1.06 1.32 2.67 2.53 ...
>   ..$ Theta         : num [1:499] NA 1.726 2.38 1.348 -0.139 ...
>   ..$ Step.ID       : num [1:499] 1 2 3 4 5 6 7 8 9 10 ...
>   ..$ In.buffer     : logi [1:499] FALSE FALSE FALSE FALSE FALSE FALSE ...
>   ..$ Dist_toBarrier: num [1:499] 28.9 21.5 27.5 29.8 22.3 ...
>   ..$ Crossed       : logi [1:499] FALSE FALSE FALSE FALSE FALSE FALSE ...
>   ..$ ID            : Factor w/ 1 level "SimID_1": 1 1 1 1 1 1 1 1 1 1 ...
>  $ barrier:'data.frame':  10 obs. of  5 variables:
>   ..$ Z1        : cplx [1:10] -0.2-30i 0-24i -0.2-18i ...
>   ..$ Z2        : cplx [1:10] 0-24i -0.2-18i 0-12i ...
>   ..$ line_id   : num [1:10] 1 1 1 1 1 1 1 1 1 1
>   ..$ kappa     : num [1:10] 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2
>   ..$ barrier.id: int [1:10] 1 2 3 4 5 6 7 8 9 10
>  - attr(*, "class")= chr "permdata"
>  - attr(*, "buffer.size")= num 5.37
>  - attr(*, "sample.rate")= num 1

Inmportantly, both track and barrier are in a segmented step format, i.e. each row represents a segment (i.e., a movement step or a barrier segment), and it is in the interaction of the movement segments and the barrier segments that the permeability is analyzed.

head(sim_track$track)
>              Z.start              Z.end D.time                Time
> 1 28.76543- 8.68803i 21.46299-11.37853i      1 2025-02-25 00:00:00
> 2 21.46299-11.37853i 27.48339-22.18237i      1 2025-02-25 01:00:00
> 3 27.48339-22.18237i 29.80944-13.19165i      1 2025-02-25 02:00:00
> 4 29.80944-13.19165i 22.10698- 9.22025i      1 2025-02-25 03:00:00
> 5 22.10698- 9.22025i 13.93393- 3.44549i      1 2025-02-25 04:00:00
> 6 13.93393- 3.44549i 13.14723+ 8.57085i      1 2025-02-25 05:00:00
>                   Step         L       Phi      Theta Step.ID In.buffer
> 1 -7.302441- 2.690500i  7.782315 -2.788587         NA       1     FALSE
> 2  6.020405-10.803844i 12.368037 -1.062407  1.7261804       2     FALSE
> 3  2.326045+ 8.990724i  9.286744  1.317631  2.3800378       3     FALSE
> 4 -7.702458+ 3.971395i  8.666016  2.665542  1.3479110       4     FALSE
> 5 -8.173053+ 5.774759i 10.007329  2.526477 -0.1390653       5     FALSE
> 6 -0.786696+12.016338i 12.042062  1.636172 -0.8903051       6     FALSE
>   Dist_toBarrier Crossed      ID
> 1       28.91244   FALSE SimID_1
> 2       21.46246   FALSE SimID_1
> 3       27.49371   FALSE SimID_1
> 4       29.82373   FALSE SimID_1
> 5       22.27161   FALSE SimID_1
> 6       14.18758   FALSE SimID_1
head(sim_track$barrier)
>                Z1              Z2 line_id kappa barrier.id
> 1 -0.20501621-30i  0.04983172-24i       1   0.2          1
> 2  0.04983172-24i -0.19873887-18i       1   0.2          2
> 3 -0.19873887-18i  0.00952611-12i       1   0.2          3
> 4  0.00952611-12i -0.02178609- 6i       1   0.2          4
> 5 -0.02178609- 6i -0.04682003+ 0i       1   0.2          5
> 6 -0.04682003+ 0i -0.15139704+ 6i       1   0.2          6

In addition to a correlated random walk (type = "crw" in simulatePermeability), users can also specify a biased correlated random walk (type = "bcrw"), including an argument for the target.location (as a matrix of X/Y coordinates) if they want to set the target location themselves (otherwise the package will simulate a random target location).

target <- data.frame(X = -40, Y = 20) |>
  as.matrix()
sim_track <- simulatePermeability(xmax = 50, ymax = 50, 
                                  N.steps = 100, theta.rho = 0.5, 
                                  step.shape = 5, step.scale = 10, 
                                  n.segments = 10,
                                  kappa = 0.2,
                                  type = "bcrw",
                                  target.location = target,
                                  plot.track = TRUE)


The user can also supply a simulated barrier of their own to the simulatePermeability function.

For example, a “grid-like” barrier could be simulated:

xmax <- 50 
ymax <- 50
n.segments <- 10 

barrier_sep <- (1/2)*xmax # spacing between linear features
y <- ymax-barrier_sep # allow spacing around barrier lines

barrier1 <- cbind(X = (xmax-barrier_sep),
                  Y = y * seq(-1,1,length = n.segments+1),
                  line_id = 1)
barrier2 <- cbind(X = (xmax-barrier_sep*2),
                  Y = y * seq(-1,1,length = n.segments+1),
                  line_id = 2)
barrier3 <- cbind(X = (xmax-barrier_sep*3),
                  Y = y * seq(-1,1,length = n.segments+1),
                  line_id = 3)
plot(barrier1, type= "l", ylim = c(-ymax,ymax), xlim = c(-xmax,xmax))
lines(barrier2)
lines(barrier3)


grid.barrier <- rbind(barrier1, barrier2, barrier3)

head(grid.barrier)
>       X   Y line_id
> [1,] 25 -25       1
> [2,] 25 -20       1
> [3,] 25 -15       1
> [4,] 25 -10       1
> [5,] 25  -5       1
> [6,] 25   0       1

The simulated barrier can be used instead of generating a random barrier, by providing it in the barrier argument - n.segments = NULL, as n.segments is now equal to the number of segments present in the user-inputted barrier object.

Importantly, the simulatePermeability function assumes that for a barrier with multiple lines that there is a equal number of segments in each line.

sim_track <- simulatePermeability(xmax = 50, ymax = 50, 
                              N.steps = 500, theta.rho = 0.5, 
                              step.shape = 5, step.scale = 10, 
                              kappa = 0.2,
                              barrier = grid.barrier,
                              plot.track = TRUE) 


The crossing table for the simulated data can be constructed using the buildCrossingTable function, which takes a permdata object (containing simulated movement track and barrier objects) and optionally, the number of null steps to sample around each observed step (default n.null = 60).

sim_cT <- buildCrossingTable(permdata = sim_track)

The resulting object, similar to the track object in permdata, is a “step-wise” dataframe with rows for every observed step in the simulated data that had null crossings with the barrier (e.g., had the “potential to cross”) and with new columns for the:

  • Number of null steps to cross/not cross the barrier (null.crossed, null.stayed)

  • Number of actual steps in the data to cross the barrier (crossed)

  • Barrier id for the barrier segment where the majority of possible steps crossed (barrier.id - corresponds to the same barrier.id in permdata)

  • If the actual step crossed, the barrier segment where it crossed (barrier.crossed) - barrier.crossed = NA if the actual step did not cross the barrier

str(sim_cT)
> 'data.frame': 46 obs. of  10 variables:
>  $ ID             : Factor w/ 1 level "SimID_1": 1 1 1 1 1 1 1 1 1 1 ...
>  $ Step.ID        : num  4 5 10 12 14 16 113 114 115 120 ...
>  $ barrier.id     : num  19 18 28 29 25 23 25 27 28 29 ...
>  $ Z.start        : cplx  -3+19i -3.1+10.5i -24.5+13.3i ...
>  $ Z.end          : cplx  -3.08+10.47i -4.79+7.51i -14.67+20.87i ...
>  $ Time           : POSIXct, format: "2025-02-25 03:00:00" "2025-02-25 04:00:00" ...
>  $ null.crossed   : int  26 20 27 28 19 12 24 24 25 21 ...
>  $ null.stayed    : int  34 40 33 32 41 48 36 36 35 39 ...
>  $ crossed        : num  0 0 0 1 0 0 0 1 0 0 ...
>  $ barrier.crossed: num  NA NA NA 29 NA NA NA 27 NA NA ...

To estimate \(\kappa\) using the compiled crossing table, the fitPermeability function can be used.

If no equation is supplied, a null model is fit:

null_model <- fitPermeability(data = sim_cT)

null_model
> 
> Estimate of kappa (with 95% confidence intervals):
>   kappa.hat    ci.low   ci.high
> 1 0.2446779 0.1017568 0.5883373

The print, summary, and predict functions can be used to examine the output, including the Log-Likelihood, AIC, and \(\kappa\) estimate with 95% confidence interval.

summary(null_model)
> $coef
>    estimate        se   z_score     p_value
> 1 -1.407813 0.4476397 -3.144968 0.001661048
> 
> $logLik
> [1] -14.43865
> 
> $AIC
> [1] 30.87731
predict(null_model)
>   kappa.hat    ci.low   ci.high
> 1 0.2446779 0.1017568 0.5883373

To potentially get more robust estimates (with tighter standard errors), a larger sample size (e.g., 50 tracks) can be used.

The simulatePermeability function has an additional argument available for n.tracks, with the default being 1. The function can also be run with parallel = TRUE (using the future.apply package) to speed up calculations.

library(future.apply)

sim_tracks <- simulatePermeability(xmax = 50, ymax = 50, 
                              N.steps = 500, theta.rho = 0.5, 
                              step.shape = 5, step.scale = 10, 
                              kappa = 0.2,
                              barrier = grid.barrier,
                              n.tracks = 50,
                              plot.track = FALSE, 
                              parallel = TRUE) 

Covariate model

Permeability can also be estimated as a function of covariate(s). Covariates can be specific to the barrier (e.g., one unique covariate value per barrier segment, e.g. unique to each barrier.id), to the movement track (e.g., one unique covariate value for each Step.ID), or the interaction of the barrier and movement track (e.g., specific to crossing table rows). We demonstrate covariate values specific to the barrier here and for the movement track and interaction of the barrier and movement track in the boreal caribou “real data” examples below.

The simulateTrack function will simulate covariate values for you (if no barrier object is specified) or, if a user-provided barrier object is input, will take an argument for covars as a vector of numeric values corresponding to the desired covariate values for each barrier segment (barrier.id).

Additionally, instead of supplying a kappa value, we supply the coefficients defining the (log-linear) relationship between \(\kappa\) and our covariate, with beta storing the desired intercept (\(\beta_0\)) and slope (e.g., \(\beta_1\)) values, respectively.

The plot shows barrier segments with low covariate values in black and barrier segments with high covariate values in green:

sim_track <- simulatePermeability(xmax = 50, ymax = 50, 
                              N.steps = 500, theta.rho = 0.5, 
                              step.shape = 5, step.scale = 10, 
                              n.segments = 15,
                              beta = c(-2,1),
                              plot.track = TRUE) 


Note that the barrier object within the permdata output now also has additional elements for the covariate values and their associated “true” \(\kappa\) values:

head(sim_track$barrier[,c("covar","kappa")])
>      covar      kappa
> 1 2.500740 0.03303250
> 2 2.645828 0.03383445
> 3 2.761895 0.03449000
> 4 5.480341 0.05406118
> 5 6.310057 0.06200998
> 6 9.729632 0.10914329

The crossing table is built the same as for the null model example:

sim_cT <- buildCrossingTable(permdata = sim_track)

After building, the crossing table needs to be annotated with the covariate values. This allows flexibility for covariates, based on whether the covariates are associated with the barrier itself or some other non-barrier (e.g., movement step) specific covariate.

For our simulated example, we have one covariate value for each barrier segment (stored in the barrier object in sim_track), so we can use the barrier.id column in the crossing table to annotate our covariate values to our crossing table.

sim_cT <- merge(sim_cT, sim_track$barrier[,c("barrier.id","covar")],
                by = "barrier.id", all.x = TRUE)

str(sim_cT)
> 'data.frame': 19 obs. of  11 variables:
>  $ barrier.id     : num  1 2 2 4 4 6 7 7 8 8 ...
>  $ ID             : Factor w/ 1 level "SimID_1": 1 1 1 1 1 1 1 1 1 1 ...
>  $ Step.ID        : num  392 363 367 379 384 386 137 109 108 216 ...
>  $ Z.start        : cplx  -2.1-42i -1.6-38.3i -3.7-49i ...
>  $ Z.end          : cplx  -8.9-34i -13-37.3i -3.8-44.8i ...
>  $ Time           : POSIXct, format: "2025-03-13 08:00:00" "2025-03-12 03:00:00" ...
>  $ null.crossed   : int  16 29 5 40 12 37 39 34 20 11 ...
>  $ null.stayed    : int  44 31 55 20 48 23 21 26 40 49 ...
>  $ crossed        : num  0 0 0 0 0 0 0 0 0 1 ...
>  $ barrier.crossed: num  NA NA NA NA NA NA NA NA NA 7 ...
>  $ covar          : num  2.5 2.65 2.65 5.48 5.48 ...

To estimate \(\kappa\) as a function of covar, we supply our covariate column from our crossing table to the formula:

covar_model <- fitPermeability(~ covar, data = sim_cT)

The estimated coefficient values can be extracted with the coef.table element:

covar_model$coef.table
>              estimate        se   z_score    p_value
> (Intercept) -1.581248 0.7986390 -1.979928 0.04771156
> covar        1.196793 0.8500841  1.407853 0.15917465

Estimates may be improved by simulating multiple tracks.

The predict function can be used to predict \(\kappa\) values for each barrier segment and unique covar value.

Be sure to specify se.fit=TRUE to get upper and lower (95%) confidence intervals on \(\kappa\) estimates with the predict function.

predict_kappa <- predict(covar_model, se.fit = TRUE)
predict_kappa$covar <- covar_model$data$covar
predict_kappa <- predict_kappa |>
  dplyr::arrange(kappa.hat) # sort by increasing kappa values

head(predict_kappa)
>    kappa.hat       ci.low   ci.high    covar
> 1 0.02550785 0.0004041883 1.6097707 2.500740
> 2 0.02651591 0.0004425351 1.5887852 2.645828
> 3 0.02651591 0.0004425351 1.5887852 2.645828
> 4 0.05654019 0.0025631688 1.2472034 5.480341
> 5 0.05654019 0.0025631688 1.2472034 5.480341
> 6 0.17593241 0.0314121231 0.9853588 9.729632

A plot can be made of the predicted \(\kappa\) values vs the covariate values for our simulated tracks, with 95% confidence intervals.

plot(predict_kappa$covar, predict_kappa$kappa.hat, type = "l",
     xlab = "Covariate Values", ylab = "Estimated Kappa Values", col ="orange", lwd = 2, ylim = c(0, 2))
lines(predict_kappa$covar, predict_kappa$ci.high, col = "grey")
lines(predict_kappa$covar, predict_kappa$ci.low, col = "grey")


Additionally, if the user wishes to supply their own barrier with covariate values for each segment, this can be done with the covar argument.

n.segments <- 10
barrier <- cbind(X = xmax * rnorm(n.segments+1, sd = xmax / 5e3),
                     Y = ymax * seq(-1,1,length = n.segments+1),
                     line_id = 1)
covar_values <- sort(runif(n.segments, 1, 5))
sim_tracks <- simulatePermeability(xmax = 50, ymax = 50, 
                              N.steps = 1000, theta.rho = 0.5, 
                              step.shape = 5, step.scale = 10, 
                              barrier = barrier,
                              covar = covar_values,
                              beta = c(-2,1),
                              plot.track = FALSE) 

Boreal caribou examples

Boreal caribou (Rangifer tarandus) are non-migratory caribou inhabiting boreal forests in Canada, including areas proximate to human development and major road highways in the Northwest Territories (NWT).

Boreal caribou with GPS collar, photo credit Agnes Pelletier
Boreal caribou with GPS collar, photo credit Agnes Pelletier


In the Northwest Territories, boreal caribou are known to utilize habitat proximate to major highways such as Highway 1 (Figure 3) but may avoid crossing when vehicle traffic is high.


Figure 3. Boreal caribou tracks around southern NWT highways.
Figure 3. Boreal caribou tracks around southern NWT highways.


This example will use GPS tracking data from 10 boreal caribou that encounter Highway 1 West, one of the major NWT highways.

Example data, including a dataframe of caribou locations, Highway 1 West sf object, and daily traffic volume data for Highway 1 West can be loaded using the data function.

data("boreal_caribou")
data("hwy1_west")
data("traffic_data")

Prep data

The permeability package requires movement data to have columns for:

  • ID (factor or character structure)

  • Time (POSIXct, date and time)

  • X and Y spatial coordinates (X and Y, units in meters)

  • Any covariate(s) specific to movement steps (can be numeric or categorical - here, traffic_volume and Season)

str(boreal_caribou)
> 'data.frame': 9522 obs. of  5 variables:
>  $ ID    : Factor w/ 10 levels "ID_1","ID_3",..: 1 4 4 1 4 1 1 4 4 1 ...
>  $ Time  : POSIXct, format: "2015-02-20 17:37:00" "2015-02-20 23:24:00" ...
>  $ X     : num  330632 409718 427748 327542 427499 ...
>  $ Y     : num  6787977 6729661 6736940 6784926 6738101 ...
>  $ Season: chr  "winter" "winter" "winter" "winter" ...

For this example, we will focus on caribou interactions with the western portion of Highway 1.

Barrier data can either be a spatial LINESTRING object (class sf) or a matrix of X and Y coordinates. It should have the same coordinate reference system as the movement data.

str(hwy1_west)
> Classes 'sf' and 'data.frame':    29 obs. of  1 variable:
>  $ geometry:sfc_LINESTRING of length 29; first list element:  'XY' num [1:568, 1:2] 472462 472177 472153 472135 472106 ...
>  - attr(*, "sf_column")= chr "geometry"
>  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: 
>   ..- attr(*, "names")= chr(0)

X and Y coordinates can be extracted from the sf object and converted to a matrix using the st_coordinates function.

hwy1_west_xy <- st_coordinates(hwy1_west)
head(hwy1_west_xy)
>             X       Y L1
> [1,] 472461.7 6771187  1
> [2,] 472176.9 6771241  1
> [3,] 472152.6 6771245  1
> [4,] 472134.7 6771248  1
> [5,] 472106.4 6771251  1
> [6,] 472086.5 6771252  1

The output includes a third column L1 for the id of the line. If the barrier object has multiple lines (e.g., a grid or network format), this column is especially useful for keeping track of distinct geometric lines. However, note that even features that appear to be one line are often made of multiple lines:

unique(hwy1_west_xy[,"L1"])
>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
> [26] 26 27 28 29

If barrier data is being provided as a non-sf object (matrix of barrier segments), permeability functions require the barrier data to contain X and Y coordinates and the third column to be named line_id, even if there is only 1 line in the object.

Alternatively, users can simply provide their barrier data as an sf LINESTRING object to permeability functions.

colnames(hwy1_west_xy) <- c("X","Y","line_id")

Users can also annotate these coordinates with covariate values specific to each segment (see simulation examples above). Just keep in mind that for each line in your barrier coordinates matrix (e.g., each unique line_id), there are \(n\) rows but \(1 - n\) segments.

Additionally, if your LINESTRING object is continuous in space but made of multiple lines (e.g., the hwy1_west example here, which has 29 lines within a continuous feature), you will need to account for the fact that the end of each line has the same spatial point as the beginning of the next line (resulting in duplicates, unless you remove them).

hwy1_west_xy[566:573,]
>             X       Y line_id
> [1,] 394810.9 6777956       1
> [2,] 394765.2 6777979       1
> [3,] 394762.2 6777981       1
> [4,] 394762.2 6777981       2
> [5,] 394714.5 6778006       2
> [6,] 394714.5 6778006       3
> [7,] 394066.9 6778348       3
> [8,] 394015.8 6778374       3

Examples are shown for this with the covariate model examples below.

It’s helpful to check using a basic plot of the barrier coordinates (as complex locations, see as.complex) that the barrier spatial object looks as it should (occasionally, imported shapefiles from ArcGIS can have unordered segments).

Note that if you have multiple lines in your barrier (e.g., multiple unique line_id values), each line should be plotted separately to ensure correct ordering of segments.

z <- hwy1_west_xy[,1] + 1i*hwy1_west_xy[,2]
plot(z, type = "l")


The permeability package requires that movement data have a (mostly) regular fix rate. The fix rate should be checked prior to utilizing permeability functions.

time_steps <- boreal_caribou |> 
  group_by(ID) |>
  arrange(Time) |>
  mutate(dT = c(NA, as.numeric(round(difftime(Time[-1], Time[-length(Time)], units = "hours"))))) |>
  ungroup() |>
  data.frame() |>
  select(dT)
barplot(table(time_steps))


The example data is primarily composed of fixes every 8 hours.

Some irregularity is tolerable, since steps not equal to the fix rate (with a tolerance that can be specified with sample.rate.tolerance in the prepPermeabilitymfunction - default 1 hour) will simply not be used in permeability functions - this ensures that the user does not need to use interpolation (which may result in “fake” regular locations).

The user should ensure that their data is clean (processed to remove any tag errors or other erroneous locations) and mostly regular. We recommend the resample function from the amt package to aggregate any steps with short fix rates less than the desired sample.rate.

Note that if there are many irregular timesteps or large gaps between locations in your data, the user should regularize their data before using the permeability package functions, as otherwise there will not be enough movement steps to estimate \(\kappa\).

After the tracking data has been cleaned, the prepPermeability function is used to further format movement data for permeability package functions, creating a permdata object similar to the simulation examples.

Main arguments to set in this function are the sample.rate (and optionally, sample.rate.tolerance, which defaults to 1 hour around the sample rate) and optionally, the buffer around the barrier (buffer.m). If buffer.m = NULL (the default), prepPermeability will set the buffer to the maximum distance from the barrier observed for steps that cross. The user can also supply their own buffer.m, but should only do so if this distance justifiably will include all steps with the “potential to cross” the barrier (all steps not in the buffer will not be used to build the crossing table).

Setting plot.track=TRUE will also optionally output a plot of each track, including the barrier in red if it’s proximate. As example for one individual:

bor_format <- prepPermeability(move.df = subset(boreal_caribou, ID == "ID_3"),
                         barrier = hwy1_west_xy,
                         sample.rate = 8,
                         plot.track = TRUE)


Covariates can also be annotated to the permdata object. Covariates specific to animal movement steps can be provided with the move.covar.names and move.covar.values arguments. Barrier segment specific covariates can be specified using the barrier.covar.values argument, which should contain a dataframe of covariate values for every barrier segment (e.g., rows correspond to each barrier segment or each unique barrier.id, AFTER accounting for unique lines, e.g., any duplicate segments).

For movement step specific covariates, the prepPermeability function takes a vector with the name(s) of the covariate column(s) in the input movement data (move.covar.names), as well as an argument for how to annotate steps with covariate data (move.covar.values, options being to use covariate values from the start location in the movement step, the end location in the step, or for numeric covariates, optionally the mean or difference of the start and end location covariate values - defaults start).

For our example boreal caribou data, locations have been annotated with 5 seasons (pre-calving Apr, calving May - June, summer Jul - Aug, fall Sep - Nov, and winter Dec - Apr), a movement-step specific covariate.

We also create a barrier-specific covariate, km, representing kilometer markers for every barrier segment along the highway, from the complex locations derived from the barrier X/Y coordinate data and accounting for unique lines.

z <- hwy1_west_xy[,1] + 1i*hwy1_west_xy[,2]

barrier_covar <- cbind(hwy1_west_xy, z) |> 
  data.frame() |>
  group_by(line_id) |>
  reframe(km = Mod(diff(z))) |>
  select(km) |>
  mutate(km = cumsum(km)/1000) |>
  data.frame()

To speed up run time, this function can also be run in parallel, using thefuture.apply package and parallel = TRUE):

bor_format <- prepPermeability(move.df = boreal_caribou,
                         move.covar.names = c("Season"),
                         barrier.covar.values = barrier_covar,
                         barrier = hwy1_west,
                         sample.rate = 8)

The output permdata object is a list with dataframes track and barrier.

class(bor_format)
> [1] "permdata"

track contains the movement data, with rows for every step and including columns for:

  • Start and end locations of the step (Z.start, Z.end, as complex locations, see ?as.complex())

  • Time length of the step (in hours, D.time)

  • Date/time for the first location in the step (Time, POSIXct format)

  • Various movement metrics (step length L in meters, absolute and turning angles in radians, Phi and Theta)

  • Whether the step was within the buffer of the barrier (In.buffer)

  • Unique identifier for the step in each individual track (Step.ID)

  • Whether the step crossed the barrier (Crossed)

  • Any step-specific covariate values (Season)

str(bor_format$track)
> 'data.frame': 9512 obs. of  14 variables:
>  $ Z.start       : cplx  330632+6787977i 327542+6784926i 327254+6784985i ...
>  $ Z.end         : cplx  327542+6784926i 327254+6784985i 328011+6787781i ...
>  $ D.time        : num  14 8 8 8 8 8 8 8 8 8 ...
>  $ ID            : Factor w/ 10 levels "ID_1","ID_11",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ Time          : POSIXct, format: "2015-02-20 17:37:00" "2015-02-21 08:03:00" ...
>  $ Step          : cplx  -3090-3051i -288+59i 757+2797i ...
>  $ L             : num  4342.6 294.1 2897.1 34.1 39.3 ...
>  $ Phi           : num  -2.362 2.94 1.307 -0.381 -2.903 ...
>  $ Theta         : num  NA 5.3 -1.63 -1.69 -2.52 ...
>  $ Step.ID       : num  1 2 3 4 5 6 7 8 9 10 ...
>  $ Season        : chr  "winter" "winter" "winter" "winter" ...
>  $ In.buffer     : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
>  $ Dist_toBarrier: num  1702 4966 4928 2085 2095 ...
>  $ Crossed       : logi  NA FALSE FALSE FALSE FALSE FALSE ...

barrier contains the barrier data, with rows for every segment and including columns for:

  • Complex locations (see as.complex()) for the start and end of each barrier segment (Z1, Z2)

  • Identifier for the each barrier segment (barrier.id)

  • Identifier for each line (in sf terms, LINESTRING) in the barrier (line_id) - will be set to 1 if there is only 1 unique line

  • Any barrier-specific covariate values (km)

str(bor_format$barrier)
> 'data.frame': 4784 obs. of  5 variables:
>  $ Z1        : cplx  472462+6771187i 472177+6771241i 472153+6771245i ...
>  $ Z2        : cplx  472177+6771241i 472153+6771245i 472135+6771248i ...
>  $ line_id   : num  1 1 1 1 1 1 1 1 1 1 ...
>  $ km        : num  0.29 0.314 0.333 0.361 0.381 ...
>  $ barrier.id: int  1 2 3 4 5 6 7 8 9 10 ...

The print or summary functions can also be used to get a quick summary of the permdata object, including the final buffer size (Buffer.dist.m), number of ID’s with locations in the buffer (N.IDs.in.Buffer), sample rate and date range of the data (Sample.rate.hrs, Date.range), number of total steps in the buffer (N.steps.in.buffer), the number of steps that crossed the barrier (N.steps.to.Cross), and the mean steplength for steps that crossed the barrier (Mean.crossing.steplength.m).

summary(bor_format)
>   N.IDs.in.buffer Sample.rate..hrs.            Date.range N.steps.in.buffer
> 1              10             7 8 9 2015-02-20-2017-12-31              2549
>   N.steps.to.cross Mean.crossing.steplength..m. Buffer.dist..m.
> 1               26                      6067.32         5199.57

Build crossing table

Null steps for each actual, observed movement step are simulated by sampling observed step lengths and angles from the data.

The buildCrossingTable function will default to creating null steps by:

  1. sampling step lengths for all steps that cross the barrier

  2. sampling all turning angles for steps within the defined buffer distance of the barrier, using n.null as the sample size

Notably, if there are < 30 steps that cross the barrier, a Weibull distribution will be fit to the steps that cross, using the mean crossing steplength as a starting scale value (with the fitdistr function from the MASS package); the rweibull function will then be used to sample n.null number of steplengths from this distribution.

Users can also optionally provide their own vectors of step lengths and angles (with the arguments: step.lengths, turning.angles, absolute.angles) to generate null steps. Users should ensure that these accurately represent steps with the “potential to cross” a given barrier, however, as fitPermeability is sensitive to the null distribution.

The crossing table can be created for all individuals at once using the buildCrossingTable function.

Main input arguments are the:

  • permdata list object with track and barrier dataframes (output from the prepPermeability function)

  • Number of null steps to be simulated for each actual step (n.null, default 60)

Note that this function works by determining all the stepwise crossings for each animal movement step and barrier segment within the buffer and at the desired sample rate, so for large datasets, this function can take 10+ minutes to run.

bor_cT <- buildCrossingTable(permdata = bor_format)

For very large datasets, running in parallel is recommended to significantly improve processing time. This option uses the future.apply package.

require(future.apply)
bor_cT <- buildCrossingTable(permdata = bor_format,
                             parallel = TRUE)

The resulting crossing table, similar to the track object in permdata as output from prepPermeability, has rows for every movement step with null crossings (e.g., includes every step that had the “potential to cross”) and additional columns for:

  • barrier.id (barrier segment where the majority of null steps crossed)

  • null.crossed and null.stayed (number of null steps to cross versus not cross the barrier)

  • crossed (1 if the actual step crossed, 0 if not)

  • barrier.crossed (barrier segment where the actual step crossed - NA if it did not)

str(bor_cT)
> 'data.frame': 2473 obs. of  10 variables:
>  $ ID             : Factor w/ 10 levels "ID_1","ID_11",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ Step.ID        : num  2 3 4 5 6 7 8 9 10 11 ...
>  $ barrier.id     : num  984 982 993 989 989 989 989 993 989 989 ...
>  $ Z.start        : cplx  327542+6784926i 327254+6784985i 328011+6787781i ...
>  $ Z.end          : cplx  327254+6784985i 328011+6787781i 328043+6787769i ...
>  $ Time           : POSIXct, format: "2015-02-21" "2015-02-21" ...
>  $ null.crossed   : num  7 10 15 25 16 23 22 22 23 25 ...
>  $ null.stayed    : num  53 50 45 35 44 37 38 38 37 35 ...
>  $ crossed        : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ barrier.crossed: num  NA NA NA NA NA NA NA NA NA NA ...

Annotate crossing table

In order to fit permeability models with covariate(s), the crossing table needs to be annotated with any of these variables.

The crossing table has rows corresponding to the interaction of each movement step (identified with Step.ID) and each barrier segment (identified with barrier.id). It additionally contains a Time column (in POSIXct format) that contains the timestamp for the first location in the movement step, which can be used to match with time-varying covariates.

For the caribou data, covariates of interest include season (categorical), traffic volume (numeric), day of year (numeric), and kilometers along the highway (numeric).

Season can be added to the crossing table by simply merging the crossing table with its permdata object, using the matching ID and Step.ID columns.

track_covar <- bor_format$track[,c("ID", "Step.ID","Season")]

bor_cT <- merge(bor_cT, track_covar, by=c("ID", "Step.ID"), 
                    all.x=TRUE)

Kilometer markers along the highway is also available in the permdata object, specifically in the barrier dataframe (km). It can be added to the crossing table using the matching barrier.id values.

barrier_covar <- bor_format$barrier[,c("barrier.id", "km")]

bor_cT <- merge(bor_cT, barrier_covar, by=c("barrier.id"), 
                     all.x=TRUE)

Traffic volume is contained in a separate dataframe, with daily traffic volumes recorded for each unique date. Date can be extracted the from the Time column in the crossing table and used to merge with the traffic data matching Date column.

bor_cT$Date <- as.Date(bor_cT$Time)

bor_cT <- merge(bor_cT, traffic_data, by = "Date", all.x = TRUE) |>
  unique()

Finally, day of year can be extracted using the lubridate package and its yday function on the Time column in the crossing table.

bor_cT$doy <- lubridate::yday(bor_cT$Date)
str(bor_cT)
> 'data.frame': 2473 obs. of  15 variables:
>  $ Date           : Date, format: "2015-02-21" "2015-02-21" ...
>  $ barrier.id     : num  984 982 989 993 989 989 989 993 984 989 ...
>  $ ID             : Factor w/ 10 levels "ID_1","ID_11",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ Step.ID        : num  2 3 6 4 5 7 8 9 12 10 ...
>  $ Z.start        : cplx  327542+6784926i 327254+6784985i 328004+6787759i ...
>  $ Z.end          : cplx  327254+6784985i 328011+6787781i 328463+6788763i ...
>  $ Time           : POSIXct, format: "2015-02-21" "2015-02-21" ...
>  $ null.crossed   : num  7 10 16 15 25 23 22 22 23 23 ...
>  $ null.stayed    : num  53 50 44 45 35 37 38 38 37 37 ...
>  $ crossed        : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ barrier.crossed: num  NA NA NA NA NA NA NA NA NA NA ...
>  $ Season         : chr  "winter" "winter" "winter" "winter" ...
>  $ km             : num  157 157 159 159 159 ...
>  $ traffic_volume : num  53 53 55 55 55 59 59 59 55 55 ...
>  $ doy            : num  52 52 53 53 53 54 54 54 55 55 ...

Plot crossings

The plotCrossings function can be used with the permdata object to visualize crossings, with the default being a base R plot for all individuals. Setting individual.plots=TRUE will output plots for tracks individually.

plotCrossings(permdata = bor_format)


More advanced options for plotCrossings include an interactive map (interactive.map=TRUE, using the mapview package) or a static map (static.map=TRUE, ggplot2 package, with the option to include a basemap with add.basemap=TRUE using the ggspatial package). Static maps can be easily exported as images.

library(mapview)

Importantly, to use the interactive and static options for plotCrossings, the user also needs to provide the original data and the sf version of the barrier, for spatial mapping.

plotCrossings(permdata = bor_format, 
              original.data = boreal_caribou,
              barrier.sf = hwy1_west,
              interactive.map = TRUE)
crossing_map <- plotCrossings(permdata = bor_format, 
                              original_data = boreal_caribou,
                              barrier.sf = hwy1_west,
                              static.map = TRUE,
                              add.basemap = TRUE)

crossing_map


Null model

The null model can be fit to the crossing table data to estimate \(\kappa\) for the barrier of interest, similar to the simulation example.

null_model <- fitPermeability(data = bor_cT)
null_model
> 
> Estimate of kappa (with 95% confidence intervals):
>    kappa.hat     ci.low    ci.high
> 1 0.04057158 0.02762201 0.05959209

Covariate models

Covariate models can include additive and/or interactive fixed effects and/or optionally, a splined covariate.

First, a model with traffic volume:

traffic_model <- fitPermeability(~ traffic_volume, data = bor_cT)
traffic_model
> Permeability fit call: ~traffic_volume 
> 
>                  estimate        se    z_score   p_value
> (Intercept)    -3.2341552 0.2023819 -15.980457 0.0000000
> traffic_volume -0.3093507 0.2012120  -1.537437 0.1241864
> 
> logLik: -131.8029 
>    AIC: 267.6058

Second, a model with season as a categorical variable:

season_model <- fitPermeability(~ Season, data = bor_cT)
season_model
> Permeability fit call: ~Season 
> 
>                      estimate        se      z_score   p_value
> (Intercept)       -3.27469848 0.2094952 -15.63137717 0.0000000
> Seasonfall        -0.01309219 0.2570990  -0.05092276 0.9593871
> Seasonpre-calving  0.19901172 0.1688073   1.17892832 0.2384267
> Seasonsummer      -0.33995220 0.3118657  -1.09005957 0.2756869
> Seasonwinter      -0.07383229 0.2810450  -0.26270630 0.7927770
> 
> logLik: -130.4724 
>    AIC: 270.9447

Models with interactive and additive effects of season can then be specified:

additive_model <- fitPermeability(~ Season + traffic_volume, data = bor_cT)
additive_model
> Permeability fit call: ~Season + traffic_volume 
> 
>                      estimate        se      z_score   p_value
> (Intercept)       -3.24351343 0.2079164 -15.60008260 0.0000000
> Seasonfall         0.04886779 0.2920432   0.16733070 0.8671098
> Seasonpre-calving  0.25741757 0.2004034   1.28449715 0.1989681
> Seasonsummer      -0.28763592 0.3272580  -0.87892713 0.3794408
> Seasonwinter       0.03767124 0.3609408   0.10436958 0.9168761
> traffic_volume    -0.00920928 0.3202563  -0.02875597 0.9770592
> 
> logLik: -130.5901 
>    AIC: 273.1802
interactive_model <- fitPermeability(~ Season*traffic_volume, data = bor_cT)
interactive_model
> Permeability fit call: ~Season * traffic_volume 
> 
>                                    estimate       se     z_score   p_value
> (Intercept)                      -3.3738979 0.237806 -14.1876067 0.0000000
> Seasonfall                        2.8282825 2.475522   1.1424994 0.2532465
> Seasonpre-calving                 2.4536024 1.751745   1.4006615 0.1613153
> Seasonsummer                      1.1038139 3.059787   0.3607487 0.7182874
> Seasonwinter                      0.4360159 2.996915   0.1454882 0.8843254
> traffic_volume                    1.1409601 1.165786   0.9787049 0.3277258
> Seasonfall:traffic_volume        -2.2562697 1.986646  -1.1357182 0.2560745
> Seasonpre-calving:traffic_volume -1.7571677 1.428196  -1.2303403 0.2185697
> Seasonsummer:traffic_volume      -1.3729088 3.082937  -0.4453250 0.6560849
> Seasonwinter:traffic_volume       0.2784876 2.130988   0.1306847 0.8960247
> 
> logLik: -130.6696 
>    AIC: 281.3392

Models with a spline of a continuous numeric covariate can also be specified, e.g., day of year or kilometers along the highway. The fitPermeability function uses a wrapper function with mgcv functions for fitting smoothers and has support for mgcv arguments (e.g., number of knots, smoother type, etc), using the s function.

Here, we specify number of knots (k) of 4 and cyclic cubic regression spline type (bs = "cc") for day of year as a covariate.

spline_doy_model <- fitPermeability(~ s(doy, k = 4, bs = "cc"), data = bor_cT)
spline_doy_model
> Permeability fit call: ~s(doy, k = 4, bs = "cc") 
> 
> Linear coefficients: 
> 
>            estimate        se   z_score p_value
> Intercept -3.237283 0.2026308 -15.97627       0
> 
> Approximate significance of smooth terms: 
> 
>        k      LRT   p_value
> s(doy) 3 2.363091 0.5005422
> 
> logLik: -131.839 
>    AIC: 271.678

Results include the number of spline parameters (k, also the number of basis functions) and results of a Likelihood Ratio Test (including test statistic and p-value) comparing the model with the splined covariate to a model without it (in this case, just a null model of permeability). A p-value < 0.05 indicates that the model with the splined covariate is significantly improved over one without it.

Spatial variability in permeability along a highway can be determined using a splined covariate of kilometers along the highway:

spline_km_model <- fitPermeability(~ s(km, k = 6, bs = "cs"), data = bor_cT)
spline_km_model
> Permeability fit call: ~s(km, k = 6, bs = "cs") 
> 
> Linear coefficients: 
> 
>            estimate        se   z_score      p_value
> Intercept -3.515376 0.5342036 -6.580593 4.685741e-11
> 
> Approximate significance of smooth terms: 
> 
>       k      LRT    p_value
> s(km) 6 12.73855 0.04738078
> 
> logLik: -126.6513 
>    AIC: 267.3025

Additive effects can also be used with a splined covariate:

doy_traffic_model <- fitPermeability(~ traffic_volume + s(doy, k = 4, bs = "cc"), 
                                        data = bor_cT)
doy_traffic_model
> Permeability fit call: ~traffic_volume + s(doy, k = 4, bs = "cc") 
> 
> Linear coefficients: 
> 
>                  estimate        se     z_score   p_value
> (Intercept)    -3.2416058 0.2038093 -15.9050925 0.0000000
> traffic_volume -0.2778513 0.3619453  -0.7676609 0.4426887
> 
> Approximate significance of smooth terms: 
> 
>        k       LRT   p_value
> s(doy) 3 0.5589067 0.9057718
> 
> logLik: -131.5234 
>    AIC: 273.0469

Model Comparison

Models can be easily compared using the comparePermFits function from permeability, which easily extracts and sorts AIC values for a set of candidate models.

Users can either provide a list of formulas or model objects (as output from fitPermeability) to the function.

bor_models <- list(Null = null_model,
                   Traffic = traffic_model, 
                   Season = season_model,
                   SeasonandTraffic = additive_model,
                   SeasonxTraffic = interactive_model,
                   SplineDoy = spline_doy_model, 
                   SplineKm = spline_km_model,
                   SplineDoyandTraffic = doy_traffic_model)
compare_models <- comparePermFits(models = bor_models)

Or:

bor_formulas <- list(Null = ~1,
                     Traffic = ~traffic_volume, 
                     Season = ~Season,
                     SeasonandTraffic = ~Season + traffic_volume,
                     SeasonxTraffic = ~Season*traffic_volume,
                     SplineDoy = ~ s(doy, k = 4, bs = "cc"), 
                     SplineKm = ~ s(km, k = 6, bs = "cs"),
                     SplineDoyandTraffic = ~ traffic_volume + s(doy, k = 4, bs = "cc"))

compare_models <- comparePermFits(bor_formulas, data = bor_cT)
compare_models
>                 Model  k    logLik      AIC       dAIC
> 1            SplineKm  7 -126.6513 267.3025  0.0000000
> 2             Traffic  2 -131.8029 267.6058  0.3032416
> 3                Null  1 -133.0205 268.0411  0.7385545
> 4              Season  5 -130.4724 270.9447  3.6422041
> 5           SplineDoy  4 -131.8390 271.6780  4.3754633
> 6 SplineDoyandTraffic  5 -131.5234 273.0469  5.7443348
> 7    SeasonandTraffic  6 -130.5901 273.1802  5.8776728
> 8      SeasonxTraffic 10 -130.6696 281.3392 14.0366378

The output includes the number of parameters (k), log Likelihood, AIC, and delta AIC (dAIC) values for each model, ordered by lowest AIC to highest.

Here, there is less than 1 AIC unit difference between a null model of permeability, a model with a spline of km along the highway, and a model with traffic volume as a covariate, suggesting that the spline of km model, despite having the lowest AIC, is not significantly improved over the other two possible models.

Model Predictions

The predict function (with se.fit = TRUE for 95% confidence intervals) can be used with either new or existing data to get predicted \(\kappa\) values as a function of covariates.

We will make predictions for the top-ranking covariate model, with traffic_volume.

predict_kappa <- data.frame(traffic_volume = traffic_model$data$traffic_volume)
prediction <- predict(traffic_model, se.fit = TRUE)
predict_kappa <- cbind(predict_kappa, prediction)

The user can also provide new data with a range of possible covariate values to predict on:

predict_kappa <- data.frame(traffic_volume = 
                         seq(0, 100, by = 1))
prediction <- predict(traffic_model, new.data = predict_kappa, se.fit = TRUE)
predict_kappa <- cbind(predict_kappa, prediction)

Predictions (with 95% confidence intervals) can be plotted to visualize the relationship between \(\kappa\) and the covariate (daily traffic volume):

plot(predict_kappa$traffic_volume, predict_kappa$kappa.hat, type = "l", 
     xlab = "Daily Traffic Volume", 
     ylab = "Estimated Kappa Values", 
     col ="orange", lwd = 2, ylim = c(0, 0.15))
lines(predict_kappa$traffic_volume, predict_kappa$ci.high, col = "grey")
lines(predict_kappa$traffic_volume, predict_kappa$ci.low, col = "grey")


The permeability package also supports boostrapping to get standard errors on estimates, in the case of the Hessian being unable to get standard errors. The fitPermeability and predict functions will give warnings if standard errors are unable to be estimated with the Hessian and will suggest that users use bootstrap = TRUE with the predict function to get standard errors and confidence intervals on estimates.

prediction <- predict(traffic_model, se.fit = TRUE, bootstrap = TRUE)

Bootstrapping can take awhile, so it’s also suggested that users run the predict function with parallel = TRUE (using the future.apply package) to speed up run time.

prediction <- predict(traffic_model, se.fit = TRUE, bootstrap = TRUE, parallel = TRUE)

Map Predictions

Predictions from fitted models can also be mapped to sf barrier segments, using the mapPredictions function.

This function takes a dataframe of predictions, including a barrier.id value for each value so that it can be mapped to barrier segments, and the original sf barrier object used as input to the prepPermeability function.

The function can also handle multiple linear features; for example, if you had three highways of interest occurring in the same spatial area, you could bind them together into one large sf object, including a column Feature.ID to distinguish them. The predictions dataframe should also have a column called Feature.ID, to map individual linear feature predictions to their own geometry.

Finally, this function has options for creating a static map without a basemap (default), with a basemap (using ggspatial if maptype is specified or basemaps with an ESRI landscape map, by default), or to create an interactive map (with mapview and interactive = TRUE). Users can optionally add an sf POINT object of their movement data to map movement tracks with the barrier(s) (using the tracks.sf argument) or even add their own basemap (provided to the basemap argument).

Here, we show an example predicting \(\kappa\) values as a function of km along the highway, using the fitted spline_km_model from above. First, a predictions dataframe is created, storing km and corresponding barrier.id values for each barrier segment.

hwy1_west_z <- hwy1_west_xy[,1] + 1i*hwy1_west_xy[,2]
predictions <- cbind(hwy1_west_xy, hwy1_west_z) |> 
  data.frame() |>
  group_by(line_id) |>
  reframe(km = Mod(diff(hwy1_west_z))) |>
  select(km) |>
  mutate(km = cumsum(km)/1000) |>
  mutate(barrier.id = 1:length(km)) |>
  data.frame() |>
  subset(barrier.id %in%  # make predictions for barrier segments in crossing table (crossings possible)
           seq(min(bor_cT$barrier.id), max(bor_cT$barrier.id, by = 1)))

Predicted \(\kappa\) values are added to the dataframe, using the predictions dataframe as a new data input.

predictions$kappa.hat <- predict(spline_km_model, se.fit = FALSE, 
                                 new.data = predictions)[,1]
str(predictions)
> 'data.frame': 3271 obs. of  3 variables:
>  $ km        : num  0.29 0.314 0.333 0.361 0.381 ...
>  $ barrier.id: int  1 2 3 4 5 6 7 8 9 10 ...
>  $ kappa.hat : num  0.142 0.142 0.142 0.142 0.142 ...

A basemap can be added with add.basemap = TRUE.

boreal_caribou_sf <- boreal_caribou |>
  st_as_sf(coords = c("X","Y"), crs = st_crs(hwy1_west))

map <- mapPredictions(predictions, barrier.sf = hwy1_west,
                      add.basemap = TRUE,
                      tracks.sf = boreal_caribou_sf)
map


Conclusions and Contact Information

We present a simple and fast approach for estimating permeability of linear barriers for large biotelemetry tracking datasets. Our approach provides similar flexibility to other statistical model packages and functions. Users can fit either univariate or multivariate models and can perform model selection to estimate barrier permeability, with the option to explore covariates specific to either the movement steps of the animal or the barrier.

This tool especially serves as a useful method for quantifying permeability with respect to a barrier itself, with broad applications for management of taxa interacting with anthropogenic infrastructure (or even natural features, such as rivers).

If you use our package in publications, please be sure to cite it as:

Note: will be available upon publication on CRAN

citation("permeability")

Report any issues or bugs to the package developers:

(Anonymous)

Acknowledgements

Boreal caribou tracking data used in this vignette was provided by the GNWT and made possible by the support of Indigenous Government Organizations and management authorities for boreal caribou, including the Sambaa K’e Dene Band, Fort Simpson Métis Local, Łı́ı́dlıı Kų́ę́ Kue First Nation, Tthets’ék’ehdélı First Nation, Pehdzeh Ki First Nation, Nahɂą Dehé Dene Band, Acho Dene Koe Band, Ka’a’gee Tu First Nation, Deninu K’ue First Nation, NWT Metis Nation, Hay River Métis Council, Fort Resolution Métis Council, K’atlodeeche First Nation, West Point First Nation, Deh Gáh Got’ıę First Nation, and Fort Providence Métis Council.